System and method for transposing web content

ABSTRACT

Provided are a system and method for enhancing static web content. In one example, the method includes extracting text content describing an item and extracting still images of the item from a host website, automatically converting the extracted text content into audio by combining keywords from the extracted text content with auto-generated supplemental words related to the item to generate an audio script, automatically converting the extracted still images into moving images by arranging the still images extracted in a sequence and adding movement to the still images to generate a video, and simultaneously playing the automatically generated audio script and the automatically generated video in response to a selection of the item. By creating and overlapping video and audio from still images and text of a listing on a web site, the listing becomes more entertaining and captivating to a viewer.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 15/726,687, filed on Oct. 6, 2017, which claims the benefit of U.S. Provisional Patent Application No. 62/405,328, filed on Oct. 7, 2016, the entire disclosures of which are incorporated herein by reference for all purposes.

BACKGROUND

The World Wide Web is a rich environment that includes web pages, blogs, news, wikis, social networking sites, research services, media types, and more. Web content is what draws a reader's interest, causing the reader to view a web page, and it is also what can keep the attention of the reader. Web content may include various forms such as text, animation, images, video, sound, and the like. Of these types of content, textual web content and still images can be of the least interest to the reader. Text content typically includes written words while still images include photographs that have been converted to digital form, without the enhancement of sound, video, or animation. Therefore, textual web content and still images can struggle to attract and retain readers for much longer than a few seconds before the reader moves away from the page.

While static content such as still images and digital text may be the preferred method for providing information about items on the Web, for example, products, job opportunities, services, and the like, in an online viewing environment, the static content requires the reader to do all the work. For example, an online merchant website may have many web pages devoted to listing products and services available for purchase. The listings often include still images of the product along with a written description about the product which provides a reader with various details such as price, warranty, availability, location, and the like. In this case, the reader must scan through and find relevant text of interest and separately scroll through the images using commands. Furthermore, a reader may have to click on and view multiple pages of textual data and still images to gain a comprehensive understanding of the item they are viewing. Therefore, what is needed is a technology that improves a reader experience when interacting with the web and provides the reader a comprehensive understanding of an item through minimal effort on the part of the reader.

SUMMARY

In one general aspect, provided is a computer-implemented method that may include at least one of extracting text content describing an item and extracting still images of the item from a host web site that includes listings of a plurality of items, automatically converting the extracted text content into audio by combining keywords from the extracted text content of the item from the host web site with auto-generated supplemental words related to the item to generate an audio script, automatically converting the extracted still images from the host web site into moving images by arranging the still images extracted from the website in a sequence and adding movement to the still images to generate a video, and simultaneously playing the automatically generated audio script and the automatically generated video in response to a selection of the item.

In another general aspect, provided is a computing system that may include at least one of a network interface configured to receive website data from a host web site including images and a description associated with an item listed on the host website, a processor configured to extract text content describing the item and the still images of the item from the received website data, automatically convert the extracted text content into audio by combining keywords from the extracted text content of the item from the host web site with auto-generated supplemental words related to the item to generate an audio script, and automatically convert the extracted still images from the host website into moving images by arranging the still images extracted from the web site in a sequence and adding movement to the still images to generate a video, and an output configured to simultaneously play the automatically generated audio script and the automatically generated video in response to a selection of the item.

Other features and aspects may be apparent from the following detailed description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Features and advantages of the example embodiments, and the manner in which the same are accomplished, will become more readily apparent with reference to the following detailed description taken in conjunction with the accompanying drawings.

FIG. 1 is a diagram illustrating a system for enhancing static web content in accordance with an example embodiment.

FIG. 2 is a diagram illustrating a process of outputting an enhanced item listing via a website in accordance with an example embodiment.

FIG. 3 is a diagram illustrating a process of extracting text content and generating an audio script in accordance with an example embodiment.

FIG. 4 is a diagram illustrating a process of extracting still images and generating a video in accordance with an example embodiment.

FIG. 5 is a diagram illustrating a method for enhancing static web content in accordance with an example embodiment.

FIG. 6 is a diagram illustrating a computing system for enhancing static web content in accordance with an example embodiment.

Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated or adjusted for clarity, illustration, and/or convenience.

DETAILED DESCRIPTION

In the following description, specific details are set forth in order to provide a thorough understanding of the various example embodiments. It should be appreciated that various modifications to the embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the invention. Moreover, in the following description, numerous details are set forth for the purpose of explanation. However, one of ordinary skill in the art should understand that embodiments may be practiced without the use of these specific details. In other instances, well-known structures and processes are not shown or described in order not to obscure the description with unnecessary detail. Thus, the present disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

Compelling online content is critical for attracting web visitors. Static content is typically the least interesting for a reader in comparison to video, audio, animation, virtual reality, and the like. One type of static content is still images or photographs. An image typically includes a picture of an object, a person, or an environment. However, after an initial glance, a viewer's concentration and focus may be lost or distracted. As a result, the viewer may move to a next page or site, or another area on the page, after only a few seconds. Another type of static content is textual content which may include a description, a list of technical specifications, price, availability, location information, and the like, within a web page, a blog, an online publication, or the like. Textual web content often includes a description or other writing composed of words, sentences, paragraphs, etc., which describes a subject matter such as an item or topics related to an item. Textual web content requires a user to read the text and comprehend the text. When both text and images are used to provide information about an item, the user must view both the images and the text (often via multiple pages) to gather a full understanding of the item.

The example embodiments provide a web-based software (e.g., program, application, service, etc.) system that extracts static content (e.g., still images and text) from a web page related to an item and converts the static content into an audio and a video that can be simultaneously output in place of the static content. The system may enhance web content by extracting still images and text-based data from a host website and converting the extracted data into audio and video, thereby improving viewer interest. The enhanced content may be output on a same site or via a different site such as a search site. The system may extract still images that are displayed within a listing associated with an item or a product for purchase and generate a video using a program that moves the images. The still images may be ordered into a sequence and the images may be moved within a video player or other window using panning, zooming, expanding, shrinking, and the like, in the form of a video. In some embodiments, the still images may be extracted from multiple web pages and combined into a video that is played on a single page, thereby relieving the viewer from having to visit different web pages. Prior to generating the sequence of images, the software may condense the images by discarding one or more images from the video generation based on an image quality, a duplication with another image, a cut-off image, an image not being associated with the item, and the like.

The system may also enhance web content by extracting textual content from the item listing such as a description of a product, technical details about a product, user reviews of the product/property, geographical location information, contact information, price, availability, and the like, and converting the extracted text content into an audio script. The system may extract entire sections of text content or it may condense the textual content by extracting only keywords, sentences, particular sections, or the like, of the text content. In addition, the system may combine text content from multiple sections of a website and combine the text content with supplemental content not listed on the website to generate an audio script that flows smoothly for a listener and improves user experience. For example, keywords and other text content may be extracted from multiple different web pages and combined into a single audio script with additional supplemental content related to the item included in the listing (e.g., job, property, service, product, etc.), thereby relieving the listener from having to visit different web pages and peruse both images and description separately.
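As a rough illustration of condensing text content down to keywords, the following Python sketch ranks words by how often they appear after dropping common stop words. The description does not specify how keywords are chosen, so the frequency ranking, the stop-word list, and the name extract_keywords are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a production system would likely use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "the", "is", "of", "to", "in",
    "with", "on", "for", "this", "has", "from",
}

def extract_keywords(text: str, limit: int = 5) -> list:
    """Return up to `limit` of the most frequent non-stop words, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    # Ignore stop words and very short tokens before counting.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(limit)]
```

The returned keywords could then be passed to whatever step assembles the audio script.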

Both the still image content and the text content may be condensed to provide a more succinct description and visual representation of the item. For example, keywords or sentences may be extracted from a larger body of textual content or multiple sections of textual content. In addition, still images may be extracted while other images may be removed or discarded prior to generating the video. Furthermore, the system may overlap a playing of the generated video and a playing of the generated audio to simultaneously output the audio and video to provide the user with both sources of content at the same time. Accordingly, when a viewer/listener selects a listing on a website such as a property listing, instead of having to separately view still images and text description (often on multiple web pages), the user may receive a combination of video and audio which are transposed from the still images and text description and which are played on a same web page, thereby significantly improving the user experience.

For purposes of description, various examples herein are described with respect to a property listing such as a travel-related website, home buying website, hotel website, or the like. However, the embodiments herein are not limited to property listings and may be applicable to all types of listings such as job postings, automobile sales, restaurant listings, service listings, or any other item that can be posted via the web and include a format that includes images and text description. That is, the embodiments herein may be applied to any web-based post that includes textual description and still images which can be transposed into audio and video content. Also, the examples herein are described with respect to static content being extracted from a host website (i.e., a first website) and converted into enhanced content on a second website. However, the original content and the enhanced content may be extracted and output via a single website rather than two or more websites.

FIG. 1 illustrates a system 100 for generating enhanced web content in accordance with an example embodiment. Referring to FIG. 1, the system 100 includes a user device 110, a content modification server 120, and a host server 130. In this example, the host server 130 may host various listings of items such as job listings, restaurant listings, property listings, and the like. A listing described herein may include any post related to an item including at least one of images and description content. The content modification server 120 may be a host server of a second website such as a search site, travel comparison site, or the like in which content from multiple websites is aggregated to provide a user with a comprehensive search listing for an item such as a job, rental property, service, restaurant, or the like. The user device 110 may be a smart phone, a tablet, a computer, a kiosk, an appliance, and the like. The user device 110, the content modification server 120, and the host server 130 may be connected to each other via a network such as the Internet, a private network, or the like.

The user device 110 may have installed therein a web browser which displays a user interface including a window associated with the web browser. The user device 110 may run the web browser and a user thereof may input a web address of a website, for example, a website hosted by the content modification server 120, the host server 130, or the like. For example, the website may be a travel related website that includes rental property listings (e.g., hotels, vacation rentals, boats, and the like). As another example, the website may be any merchant related website with items (e.g., products and/or services) for sale posted or indicated therein. As another example, the website may be a job postings site, a restaurant listings site, or the like. The user device 110 may select a web page of a website provided by the content modification server 120 and/or the host server 130.

According to various embodiments, static content 132 of a web page hosted by the host server 130 may be extracted by the content modification server 120 and converted into more appealing content having a form of audio and video. For example, the content modification server 120 may obtain data from the host server 130 at intervals of time and store data related to postings and listings hosted by the host servers 130. Here, the content modification server 120 may convert the text content and still imagery from the host servers 130 into audio and video and store the audio and video such that the audio and video content is automatically provided to the user device 110 when the user device 110 requests listing content via the website hosted by the content modification server 120. As another example, the content modification server 120 may receive a request from a user via the website hosted by the content modification server 120 and, in response, automatically extract content from a host website hosted by the host server 130, convert the content into audio and video, and output the audio and video in real-time. For example, the content modification server 120 may include a local cache where content extracted from host server 130 can be temporarily stored, converted into audio and video, and output to the user device 110.
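The two modes described above (converting at intervals ahead of time, and converting on request using a local cache) can be sketched together in Python. This is a minimal sketch under stated assumptions: the class name ConversionCache and the fetch and convert callables are hypothetical stand-ins for whatever the content modification server actually uses, and the in-memory dictionary stands in for the local cache or database.

```python
from typing import Callable, Dict, Iterable, Tuple

# Placeholder type: (audio reference, video reference).
Media = Tuple[str, str]

class ConversionCache:
    """Serves converted media, converting on first request and reusing it after."""

    def __init__(self, fetch: Callable[[str], str], convert: Callable[[str], Media]):
        self._fetch = fetch      # hypothetical: pulls static content from the host
        self._convert = convert  # hypothetical: turns static content into media
        self._store: Dict[str, Media] = {}

    def prefetch(self, listing_ids: Iterable[str]) -> None:
        """Interval mode: convert listings before any user asks for them."""
        for listing_id in listing_ids:
            self.get(listing_id)

    def get(self, listing_id: str) -> Media:
        """On-demand mode: convert when requested unless already stored."""
        if listing_id not in self._store:
            static_content = self._fetch(listing_id)
            self._store[listing_id] = self._convert(static_content)
        return self._store[listing_id]
```

Both modes share one lookup path, so a listing converted during a scheduled pass is served from the store when a user later requests it.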

In some embodiments, the content modification server 120 may be a web server that auto-detects static web content 132 from various sources on the web such as websites, web pages, databases, multiple host servers 130, and the like. In some examples, the detected web content may be related to a particular industry or topic (e.g., travel, job opportunities, services, home buying, etc.). As a non-limiting example, the static content 132 may be accommodation-related information (e.g., vacation home rental, hotel, and the like), rental car information, and the like. As another example, the static content 132 may be product related information (e.g., shoes, clothing, equipment, tools, appliances, devices, etc.), financial related information, sports related information, activity related information, and the like. To auto-detect the web content, the content modification server 120 may perform a crawl of one or more web sources (e.g., web sites, databases, etc.) on a periodic basis, for example, daily, hourly, weekly, and the like, and store the web content detected therefrom in a local database or an external database connected thereto.
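A periodic crawl needs some way to decide which sources are due. The Python sketch below shows one simple possibility, assuming the server records when each source was last crawled; the function name sources_due and the dictionary layout are illustrative assumptions rather than part of the described system.

```python
from datetime import datetime, timedelta

def sources_due(last_crawled: dict, interval: timedelta, now: datetime) -> list:
    """Return sources never crawled, or last crawled at least `interval` ago.

    `last_crawled` maps a source URL to a datetime, or to None if the
    source has not been crawled yet (an assumed bookkeeping format).
    """
    due = []
    for source, when in last_crawled.items():
        if when is None or now - when >= interval:
            due.append(source)
    return due
```

With an interval of one day this corresponds to the daily crawl mentioned above; an hourly or weekly crawl only changes the interval argument.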

In one example, a website provided by the host server 130 includes static content of a plurality of accommodation listings (e.g., vacation rentals, hotels, and the like). Each listing may include one or more images of a real property associated with the accommodation listing, and text content such as user reviews, descriptions of the property, star ratings, information about a geographical area at which the property is located, events at the property, contact information, availability, pricing, and the like. However, static content 132 is not the most interesting content. In addition, a user may have to expend time and effort to identify all of the textual content associated with an accommodation listing when the text content is dispersed throughout different tabs or on multiple pages. In this case, it may be difficult to ascertain desired information from a webpage without scrolling up and down and moving between web pages of the website. Furthermore, the text content may include small print which is difficult to see/comprehend.

According to various embodiments, the content modification server 120 may enhance the static content 132 from the website provided by the host server 130 by extracting text information and still images and generating audio and video content 122 associated with the accommodation listing. As shown in FIG. 1, the host server 130 may initially provide a static listing that includes still images and text related to an accommodation listing. The content modification server 120 may extract the initial web content and generate audio and video content 122 associated with the accommodation listing, and provide the audio and video content 122 to the user device 110 when the user requests information about the listing. The audio content may be generated based on a script or a predefined template including one or more keywords that are of interest. As another example, the audio may be generated differently based on a particular type of the item. For example, a boat may have a different audio content than a laundry machine, and the like. Meanwhile, the video content may be generated by arranging the still images in a sequence and moving the images (e.g., panning and zooming) within the frame of a video player.

FIG. 2 illustrates a process 200 of outputting an enhanced item listing via a website in accordance with an example embodiment. In this example, the item listing is a property listing such as a listing for a vacation rental, hotel, home purchase/rental, or the like. The original data may be hosted on a host website 210 and may include static content such as a plurality of images 211, property details and characteristics 212, and a property description 213. Although not shown, the host website 210 may include additional information about the property such as user reviews, contact information, and the like. Meanwhile, an enhancement site 220 may extract data from host website 210 (as well as other host websites) and generate enhanced content of the static content. For example, the enhancement website 220 may convert static images 211 into a video 222 and convert the text content (212, 213, etc.) into an audio 224 and simultaneously output the converted video and audio via the enhancement website 220.

According to various aspects, a software program and system may extract static content from individual pages (e.g., property page on host website 210) and turn the static content (verbiage and images) into a video 222 and an audio 224 about the property listing. The static content may be stored in a database and converted at a later time. As another example, the static content may be converted before it is stored in the database and accessed by the enhancement site. The software may be stored and executed on a same device that hosts the enhancement site 220 or it may be stored and executed by a remote device that is in communication with the enhancement site 220 via a network. The software and system may crawl the Internet, databases, resources, and the like, and grab the content from multiple sites and store the content into a database. As another example, the software may grab the content “on-the-fly” and temporarily store the static content in a temporary memory such as a local cache, convert the static content into moving/audio content, and output the converted content via the enhancement site 220, without having to pre-store the content in advance.

The conversion described herein may be triggered by a user entering the enhancement website 220 and navigating to a property page and viewing a link which is live on the host website 210. The enhancement site 220 may be a search site or the like which can be used to aggregate or accumulate item search results from multiple sites and provide a comparison of the item content from multiple different host sites. Rather than display the static content live on the host website 210, the enhancement site 220 may convert static image content 211 from the host website 210 into video content 222. In addition, the text content that is provided at the bottom of the host website 210 may be extracted and put into a template where it is read as the audio 224 along with the video 222.

According to various aspects, the static images 211 may be auto-transposed into the video 222 and the static text content (e.g., at the bottom of the page, on a second page, etc.) may be auto-converted into an audio script, converted into speech, and read as an audio 224 while the video is simultaneously playing. The video content 222 and the audio content 224 may be overlapped with one another such that they each play at the same time. In some embodiments, the video 222 is a moving slideshow. In this example, the audio 224 may be overlapped with the video while the video is fading in and out between images.

Prior to the embodiments described herein, in order to have content converted to video and audio, a user would have to download a tool and install the tool. Then, the user would have to download images and run the tool to convert the images into video. The user would also need to perform the same actions for audio, but first they would have to find the description and save it into the tool to convert it. The user would then somehow have to link the two together. The example embodiments significantly improve upon this process by automatically extracting static content (images and text) from a web page and automatically converting the static content into video and audio without requiring any user interaction. Furthermore, the system can discard images that are fuzzy, cut off, of poor quality, or duplicates of other images. Also, the system can enhance the textual description by supplementing the textual description with additional language. For example, the system may extract keywords or other language from the web page and add that to the description to generate an audio. As another example, the system may insert the words into a template that includes additional wording based on a designer preference.

FIG. 3 illustrates a process 300 of extracting text content from a host website and generating an audio script in accordance with an example embodiment. In this example, the host website includes a plurality of web pages including a first web page 310 and a second web page 320. Meanwhile, textual content from various sections of both the first web page 310 and the second web page 320 which are related to a common item (e.g., a property listing) may be extracted and added to an audio script 330 which may be read by an enhancement website 330. The text content may include information about an item such as a hotel or vacation rental at a geographical destination including one or more of a name of the property, rental prices, geographical location information of the property, amenities of the property, descriptions of the property, reviews of the property, star ratings of the property, and the like. Here, the audio script includes blank spaces or openings that are to be filled in with text content from various sections of the property listing shown on the host website 310. For example, keywords may be extracted from text content and inserted into the blanks of audio script 330 or entire passages of text content may be inserted into the blanks of the audio script 330. In this example, the host website includes text sections related to an availability of the property, a price range, a number of rooms, a geographical location, a user rating, contact details, and the like, on a first web page, and user reviews on a second web page 320. Text content may be extracted from sections of both the first web page 310 and the second web page 320 and added to the audio script 330.
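The fill-in-the-blanks step can be sketched with Python's standard string.Template, where each placeholder plays the role of a blank in the audio script. The template wording, the field names (name, location, price, review), and the function build_audio_script are assumptions for illustration; the description does not fix the template's content or the names of its blanks.

```python
from string import Template

# Hypothetical script with four blanks; the supplemental wording around
# the blanks is invented for this example.
SCRIPT_TEMPLATE = Template(
    "Welcome to $name, located in $location. "
    "Rates start at $price per night. "
    "One recent guest said: $review"
)

def build_audio_script(first_page: dict, second_page: dict) -> str:
    """Merge text extracted from two pages and fill the script's blanks.

    Raises KeyError if a blank has no extracted text, so an incomplete
    script is never handed to the text-to-speech step.
    """
    fields = {**first_page, **second_page}
    return SCRIPT_TEMPLATE.substitute(fields)
```

Here the first dictionary stands for details taken from the first web page and the second for a review taken from the second web page.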

Furthermore, the audio script 330 may be based on a template that changes based on a type of listing. For example, the template may be different if the item is a job opportunity instead of a property listing. As another example, the template may be different if the listing is related to a service or a restaurant, rather than a property listing. The template is not limited to a particular style. The template itself may include additional description that is related to the item in a general manner. However, when the template is combined with textual content from the host website, the audio script 330 becomes directed towards specific details of the item listing from the host website. In addition to text content, the audio script 330 may include music, celebrity voices, computerized voices, sounds, and the like, which may be used based on user preferences, a geographical location of the property, a time of year, and the like.
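Choosing a template by listing type can be as simple as a lookup with a general fallback. The Python sketch below assumes three listing types and invents the wording and field names of each template; none of these specifics come from the description.

```python
# Hypothetical templates keyed by listing type.
TEMPLATES = {
    "property": "This {bedrooms}-bedroom property in {location} is available now.",
    "job": "{employer} is hiring a {title} in {location}.",
    "restaurant": "{name} serves {cuisine} cuisine in {location}.",
}

# Assumed fallback for listing types without a dedicated template.
GENERIC_TEMPLATE = "Here are the details for this listing in {location}."

def script_for(listing_type: str, details: dict) -> str:
    """Pick the template for the listing type and fill it with details."""
    template = TEMPLATES.get(listing_type, GENERIC_TEMPLATE)
    return template.format(**details)
```

Adding a new listing type then only requires a new dictionary entry.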

FIG. 4 illustrates a process 400 of extracting still images 411 from a host website 410 and generating a video 432 which is played on an enhancement site 430, in accordance with an example embodiment. In this example, a plurality of still images (e.g., images 1-5) may be extracted from the host website 410 and added to a video being played within a video player of enhancement site 430. Here, a sequence of images 420 is extracted and converted into a video for the enhancement site 430. During the extraction and conversion process, one or more images from the sequence of images 420 may be discarded such as image 422. The image may be discarded based on one or more factors such as poor quality, duplication, and the like. Also, the images may be ordered in a sequence based on one or more factors provided by the enhancement site 430 or randomly. The sequence may be equivalent to the sequence in which the images are stored on the host website 410, or it may be modified to illustrate certain components or features of the property (or other item) first. For example, the images may be displayed such that room images are shown before pool images, and the like.
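The discard-and-order step can be sketched in Python as a filter followed by a stable sort. The sketch assumes each image already carries a content checksum and a subject label, and it uses a minimum resolution as a stand-in for image quality; the StillImage fields, the thresholds, and the priority table are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StillImage:
    url: str
    width: int
    height: int
    checksum: str  # hypothetical content hash used to spot duplicates
    subject: str   # hypothetical label such as "room" or "pool"

# Assumed ordering rule: rooms first, then pools, then everything else.
SUBJECT_PRIORITY = {"room": 0, "pool": 1}
MIN_WIDTH, MIN_HEIGHT = 640, 480  # assumed quality threshold

def select_and_order(images):
    """Drop low-resolution and duplicate images, then order by subject."""
    seen = set()
    kept = []
    for image in images:
        too_small = image.width < MIN_WIDTH or image.height < MIN_HEIGHT
        if too_small or image.checksum in seen:
            continue
        seen.add(image.checksum)
        kept.append(image)
    # sorted() is stable, so images with equal priority keep the host order.
    fallback = len(SUBJECT_PRIORITY)
    return sorted(kept, key=lambda im: SUBJECT_PRIORITY.get(im.subject, fallback))
```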

The system may generate the video 432 based on the still images 411, for example, using a program that moves the images, overlays text (a.k.a. “video titles”), and adds an audio track combining music and a computerized text-to-speech voice synthesizer that reads the extracted text. The images may be moved by panning, zooming, expanding, decreasing in size, and the like, within the video player of the enhancement site 430.
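One common way to add panning and zooming to a still image is to move a crop rectangle across it over successive frames. The Python sketch below computes those rectangles by linear interpolation; it is an illustrative assumption about how the movement might be produced, not the program the description refers to, and rendering each crop to a video frame is left out.

```python
def pan_zoom_frames(start, end, frame_count):
    """Linearly interpolate crop rectangles (x, y, width, height).

    `start` is the crop shown on the first frame and `end` the crop shown
    on the last frame. A shrinking rectangle reads as a zoom in; a moving
    rectangle reads as a pan.
    """
    if frame_count < 2:
        return [tuple(start)]
    frames = []
    for i in range(frame_count):
        t = i / (frame_count - 1)  # runs from 0.0 on the first frame to 1.0 on the last
        frames.append(tuple(s + (e - s) * t for s, e in zip(start, end)))
    return frames
```

Keeping the start and end rectangles at the same aspect ratio as the video player avoids stretching the image during the move.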

According to various aspects, the system herein may generate audio/video about an accommodation listing that is more entertaining for a viewer and which is capable of providing information about the accommodation listing to the viewer using audio and video based on data extracted from multiple web pages without requiring the viewer to identify the information from text content on the web pages. For example, the system may generate a more interesting video and audio of the item that captures relevant content while excluding less relevant content, thereby condensing information provided from a host website.

FIG. 5 illustrates a method 500 for enhancing static web content in accordance with an example embodiment. For example, the method 500 may be performed by the content modification server 120 shown in FIG. 1, a crawl server, a web server, a host server, a user device, a cloud platform, or a combination thereof. Referring to FIG. 5, in 510 the method includes extracting text content describing an item and extracting still images of the item from a host website that includes listings of a plurality of items. For example, the text content and the still images may be received from a host website which is hosting a listing of the item for sale or viewing. The still images may include a plurality of images taken of or about the item and may be located on a single web page or disposed on multiple web pages. For example, a thumbnail image may be disposed on a first web page of the host website and an expanded image may be shown on a second web page of the host website when the thumbnail is selected on the first web page of the host website. The description of the item may include a description of the product, property, service, job opportunity, and the like. The description may be included in multiple parts of the website and on multiple pages of the host website. For example, a first web page may include a description of an item for sale such as parts, uses, location, availability, and the like, and a second web page may include user reviews describing the item.
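The extraction in 510 can be sketched with Python's standard html.parser module. This minimal sketch assumes the description text sits in p elements and the still images in img elements, which is an assumption about the host pages' markup; real listings would need page-specific selection rules. The names ListingExtractor and extract_listing are illustrative.

```python
from html.parser import HTMLParser

class ListingExtractor(HTMLParser):
    """Collects <img> sources and the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.image_urls = []
        self.paragraphs = []
        self._in_paragraph = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)
        elif tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            # Collapse runs of whitespace left over from the page layout.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_paragraph = False

def extract_listing(pages):
    """Return (texts, image_urls) gathered across several pages of HTML."""
    texts, images = [], []
    for html in pages:
        extractor = ListingExtractor()
        extractor.feed(html)
        extractor.close()
        texts.extend(extractor.paragraphs)
        images.extend(extractor.image_urls)
    return texts, images
```

Passing both the first and second web page of a listing yields one combined list of text passages and one combined list of image addresses.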

In 520, the method includes automatically converting the extracted text content into audio by combining keywords from the extracted text content of the item from the host website with auto-generated supplemental words related to the item to generate an audio script. Furthermore, in 530 the method includes automatically converting the extracted still images from the host website into moving images by arranging the still images extracted from the website in a sequence and adding movement to the still images to generate a video. It should be appreciated that the converting to audio in 520 and the converting to video in 530 may be performed simultaneously or at separate times in either order. The converting may be performed in response to a viewer selecting a listing associated with the item on a second website which provides listings from an aggregation of sites. For example, the second website may be a travel website providing travel-related listings from a combination of source sites. In this case, the second website may monitor/crawl the source/host websites on a regular (e.g., periodic or random) basis. The second website may display listings corresponding to the host websites; however, the listings on the second website may be enhanced to include audio and video instead of still images and text content.

In some embodiments, the item may include a property listing on at least one of a rental site, a travel site, and an accommodation site, which is combined with property listings from multiple sites within a cumulative site. In this example, the automatically converting of the extracted text content into audio may include extracting keywords from at least two of a description of the property, a user review of the property, and a geographic location of the property, and inserting the extracted keywords into a template which includes supplemental words related to the property to generate the audio script. Here, the keywords may be extracted from different sections of a web page and from different web pages of the host website and combined into a single audio script that may be read to a listener via a video player on a single web page. The template may include supplemental content that is related to a subject matter of the item listed. For example, the supplemental content may include content that enhances a description of the item by adding language including words, sentences, and phrases before, between, and after keywords extracted from the text content of the host website.
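
As a non-limiting illustration of inserting extracted keywords into a template, the following Python sketch uses the standard-library string.Template class. The template wording, the placeholder names, and the sample keywords are assumptions made for illustration only and do not reflect any particular template of the embodiments.

```python
from string import Template

# Hypothetical template: the fixed wording plays the role of the supplemental
# words, and each $placeholder is an empty space to be filled with a keyword
# extracted from the host website.
PROPERTY_TEMPLATE = Template(
    "Welcome to this $property_type in $location. "
    "Guests describe it as $review_keyword, and it offers $amenity."
)


def build_audio_script(keywords):
    """Insert extracted keywords into the template to produce the script text."""
    # substitute() raises KeyError if a placeholder has no keyword, so an
    # incomplete extraction is noticed rather than silently read aloud.
    return PROPERTY_TEMPLATE.substitute(keywords)


# Assumed keywords drawn from a description, a user review, and a location.
script = build_audio_script({
    "property_type": "beachfront cottage",
    "location": "Maui",
    "review_keyword": "quiet and spotless",
    "amenity": "a private lanai",
})
```

The resulting text could then be passed to a text-to-speech converter, which is outside the scope of this sketch.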

In some embodiments, the automatically converting of the extracted still images may include removing one or more still images extracted from the host website prior to generating the video. For example, the software application described herein may analyze a still image and determine whether the image is of the correct item/subject matter, whether the image is of a predetermined quality, whether the image has been cut off, and the like, and discard any images that do not meet certain characteristics. As another example, the automatically converting the extracted still images from the host website into the moving images may include adding at least one of panning and zooming to each of the still images to generate the video.
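
As a non-limiting illustration of discarding unsuitable images and adding panning and zooming, the following Python sketch filters images by a minimum resolution and computes a sequence of crop rectangles for one still image. The names StillImage, filter_images, and pan_zoom_crops, the resolution thresholds, and the linear interpolation are assumptions; rendering the crops into video frames is left to whatever imaging library an implementation uses.

```python
from dataclasses import dataclass


@dataclass
class StillImage:
    url: str
    width: int
    height: int


def filter_images(images, min_width=640, min_height=480):
    """Keep only images meeting an assumed minimum quality (resolution) threshold."""
    return [im for im in images if im.width >= min_width and im.height >= min_height]


def pan_zoom_crops(width, height, frames, end_zoom=1.25):
    """Return one crop box (left, top, right, bottom) per frame.

    The crop shrinks from the full image to 1/end_zoom of its size (zoom in)
    while its left edge slides to the right (pan). Scaling each crop back up
    to the output size yields the appearance of camera movement.
    """
    crops = []
    for i in range(frames):
        t = i / (frames - 1) if frames > 1 else 0.0   # progress from 0.0 to 1.0
        zoom = 1.0 + (end_zoom - 1.0) * t
        crop_w = width / zoom
        crop_h = height / zoom
        left = (width - crop_w) * t                   # horizontal pan
        top = (height - crop_h) / 2                   # stay vertically centered
        crops.append((left, top, left + crop_w, top + crop_h))
    return crops


kept = filter_images([
    StillImage("full.jpg", 1600, 900),
    StillImage("thumb.jpg", 120, 90),   # too small, discarded
])
crops = pan_zoom_crops(1000, 500, frames=3, end_zoom=2.0)
```

A check that an image shows the correct subject matter would require image analysis beyond this sketch and is not attempted here.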

In 540, the method includes simultaneously playing the automatically generated audio script and the automatically generated video in response to a selection of the item. For example, the audio script may include a first duration and the video comprises a second duration that is different than the first duration, and the simultaneously playing may include overlapping a playing of the audio and a playing of the video based on a shorter duration from among the first and second durations. For example, if the audio is the shorter duration, an entire portion of the audio can be overlapped by the video while only a portion of the video is overlapped by audio. The video may be generated by zooming and panning on each of the still images, for example, to create a Ken Burns style video, or the like.
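
As a non-limiting illustration of overlapping the audio and the video based on the shorter duration, the following Python sketch computes how long the two are played together and how much of each remains unaccompanied. The function name overlap_window and the returned field names are hypothetical.

```python
def overlap_window(audio_seconds, video_seconds):
    """Compute the overlap of audio and video based on the shorter duration."""
    overlap = min(audio_seconds, video_seconds)
    return {
        "overlap": overlap,                                  # both play together
        "audio_unaccompanied": audio_seconds - overlap,      # audio with no video
        "video_unaccompanied": video_seconds - overlap,      # video with no audio
    }


# Assumed durations: a 20-second audio script and a 30-second video.
window = overlap_window(20, 30)
```

With these assumed durations the audio is the shorter of the two, so the entire audio is overlapped by video while the final ten seconds of video play without audio.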

FIG. 6 illustrates a computing system 600 for enhancing static web content in accordance with an example embodiment. For example, the computing system 600 may be a database, cloud platform, streaming platform, and the like. As a non-limiting example, the computing system 600 may be the content modification server 120 shown in FIG. 1. In some embodiments, the computing system 600 may be distributed across multiple devices. Also, the computing system 600 may perform the method 500 of FIG. 5. Referring to FIG. 6, the computing system 600 includes a network interface 610, a processor 620, an output 630, and a storage device 640 such as a memory. Although not shown in FIG. 6, the computing system 600 may include other components such as a display, an input unit, a receiver, a transmitter, a text-to-speech converter, and the like.

The network interface 610 may transmit and receive data over a network such as the Internet, a private network, a public network, and the like. The network interface 610 may be a wireless interface, a wired interface, or a combination thereof. The processor 620 may include one or more processing devices each including one or more processing cores. In some examples, the processor 620 is a multicore processor or a plurality of multicore processors. Also, the processor 620 may be fixed or it may be reconfigurable. The output 630 may output data to an embedded display of the computing system 600, an externally connected display, a display connected to the cloud, another device, and the like. The storage device 640 is not limited to a particular storage device and may include any known memory device such as RAM, ROM, hard disk, and the like, and may or may not be included within the cloud environment. The storage device 640 may store software modules or other instructions which can be executed by the processor 620 to perform the method 500 shown in FIG. 5.

According to various embodiments, the network interface 610 may receive website data from a host website including images and a description associated with an item listed on the host website. The item may include a real property, a product, a service, a job opportunity, and the like. The processor 620 may extract text content describing the item and still images of the item from the received website data. Furthermore, the processor 620 may automatically convert the extracted text content into an audio file or files by combining keywords from the extracted text content related to the item from the host website with auto-generated supplemental words related to the item to generate an audio script. In addition, the processor 620 may automatically convert the extracted still images from the host website into moving images by arranging the still images extracted from the website in a sequence and adding movement to the still images to generate a video. Furthermore, the output 630 may simultaneously play the automatically generated audio script and the automatically generated video in response to a selection of the item.

According to various embodiments, the content enhancement software described herein may be performed by a web server that auto-detects, extracts, and stores content from around the World Wide Web. For example, the web server may auto-detect vacation travel information from a plurality of travel-related websites on the web such as vacation rental websites, hotel rental websites, flight websites, and the like, including vacation rental housing accommodations, sight-seeing information, attraction information, flight information, and the like. As yet another example, the web server may auto-detect information about products, for example, shoes, clothing, materials, consumer goods, furniture, appliances, and the like. It should be appreciated that the embodiments are not limited to a particular industry or a particular type of item or accommodation.

The example embodiments are directed to enhancing web content associated with an item. The system may extract text content from the web content and generate an audio script which describes the item and which includes supplemental description related to a type of the item. Furthermore, the system may convert the audio script from written text to audio using a text-to-speech converter. The system may also generate video of the item using one or more still images extracted from the web content. For example, the video may include zooming and panning (e.g., Ken Burns style) applied to the still images to create more interesting visual content. The generated audio and video may be combined and timed to play at the same time such that the audio is relevant in time to what is being shown in the video. Accordingly, web content may be enhanced.

As will be appreciated based on the foregoing specification, the above-described examples of the disclosure may be implemented using computer programming or engineering techniques including computer software, firmware, hardware or any combination or subset thereof. Any such resulting program, having computer-readable code, may be embodied or provided within one or more non-transitory computer-readable media, thereby making a computer program product, i.e., an article of manufacture, according to the discussed examples of the disclosure. For example, the non-transitory computer-readable media may be, but is not limited to, a fixed drive, diskette, optical disk, magnetic tape, flash memory, semiconductor memory such as read-only memory (ROM), and/or any transmitting/receiving medium such as cloud storage, the Internet of Things, or other communication network or link. The article of manufacture containing the computer code may be made and/or used by executing the code directly from one medium, by copying the code from one medium to another medium, or by transmitting the code over a network.

The computer programs (also referred to as programs, software, software applications, “apps”, or code) may include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, apparatus, cloud storage, internet of things, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The “machine-readable medium” and “computer-readable medium,” however, do not include transitory signals. The term “machine-readable signal” refers to any signal that may be used to provide machine instructions and/or any other kind of data to a programmable processor.

The above descriptions and illustrations of processes herein should not be considered to imply a fixed order for performing the process steps. Rather, the process steps may be performed in any order that is practicable, including simultaneous performance of at least some steps. Although the disclosure has been described in connection with specific examples, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the invention as set forth in the appended claims.

What is claimed is:
 1. A method comprising: extracting text content describing an item from multiple web pages of a listing on a host website, the extracting comprising extracting keywords from a first web page of the listing on the host website which describes the item and a second web page of the listing on the host website which further describes the item; converting the extracted text content into an audio by combining the keywords from the first and second web pages of the host website within a template that includes supplemental words related to the item to generate an audio script; and playing the generated audio script in response to a selection of the item.
 2. The method of claim 1, wherein the method further comprises monitoring the host website via a comparison search website, and the playing comprises playing the generated audio script in response to a selection of the item on the comparison search website.
 3. The method of claim 1, wherein the generated audio script comprises the keywords extracted from the first and second web pages interspersed among the supplemental words related to the item.
 4. The method of claim 1, wherein the item comprises a property listing on at least one of a rental site, a travel site, and an accommodation site.
 5. The method of claim 4, wherein the converting comprises collecting keywords extracted from a description of the property and a geographic location of the property, and inserting the extracted keywords into the template to generate the audio script.
 6. The method of claim 1, wherein the extracting comprises extracting keywords from user reviews posted on one of the first or the second web pages.
 7. The method of claim 1, wherein the combining comprises inserting the extracted keywords into empty spaces within the template that are to be filled-in with the extracted keywords.
 8. A computing system comprising: a memory storing instructions; and a processor communicably coupled to the memory and configured to: extract text content describing an item from multiple web pages of a listing on a host website, where the extracting comprises extracting keywords from a first web page of the listing on the host website which describes the item and a second web page of the listing on the host website which further describes the item; convert the extracted text content into an audio by combining the keywords from the first and second web pages of the host website within a template that includes supplemental words related to the item to generate an audio script; and play the generated audio script in response to a selection of the item.
 9. The computing system of claim 8, wherein the processor is further configured to monitor the host website via a comparison search website, and play the generated audio script in response to a selection of the item on the comparison search website.
 10. The computing system of claim 8, wherein the generated audio script comprises the keywords extracted from the first and second web pages interspersed among the supplemental words related to the item.
 11. The computing system of claim 8, wherein the item comprises a property listing on at least one of a rental site, a travel site, and an accommodation site.
 12. The computing system of claim 11, wherein the processor is configured to collect keywords extracted from a description of the property and a geographic location of the property, and insert the extracted keywords into the template to generate the audio script.
 13. The computing system of claim 8, wherein the processor is configured to extract keywords from user reviews posted on one of the first or the second web pages.
 14. The computing system of claim 8, wherein the processor is configured to insert the extracted keywords into empty spaces within the template that are to be filled-in with the extracted keywords.
 15. A non-transitory computer-readable medium comprising instructions that when executed by a processor cause a computer to perform a method comprising: extracting text content describing an item from multiple web pages of a listing on a host website, the extracting comprising extracting keywords from a first web page of the listing on the host website which describes the item and a second web page of the listing on the host website which further describes the item; converting the extracted text content into an audio by combining the keywords from the first and second web pages of the host website within a template that includes supplemental words related to the item to generate an audio script; and playing the generated audio script in response to a selection of the item.
 16. The non-transitory computer-readable medium of claim 15, wherein the method further comprises monitoring the host website via the comparison search website, and the playing comprises playing the generated audio script in response to a selection of the item on the comparison search website.
 17. The non-transitory computer-readable medium of claim 15, wherein the generated audio script comprises the keywords extracted from the first and second web pages interspersed among the supplemental words related to the item.
 18. The non-transitory computer-readable medium of claim 15, wherein the item comprises a property listing on at least one of a rental site, a travel site, and an accommodation site.
 19. The non-transitory computer-readable medium of claim 15, wherein the converting comprises collecting keywords extracted from a description of the property and a geographic location of the property, and inserting the extracted keywords into the template to generate the audio script.
 20. The non-transitory computer-readable medium of claim 15, wherein the extracting comprises extracting keywords from user reviews posted on one of the first or the second web pages.